Code Review
This pull request updates the project's dependencies and versioning, and migrates code examples and documentation from the generic Template to Qwen3_5Template. It also refactors package installation in the Dockerfile, integrates the Megatron installation directly, and clarifies LoRA weight synchronization behavior in vLLM. The review comments flag redundant package installations in the Dockerfile, a full-width character in the English documentation, a typo in the Chinese documentation, and a less deterministic version constraint for the datasets package.
RUN pip install numpy==2.2 --no-cache-dir

# Install tinker, ray, and other deps
RUN pip install --no-cache-dir tinker==0.14.0 "ray[serve]" transformers peft accelerate -U

Currently, the model-template mapping is simple:

- Template class:Supported in all pure text LLMs.
- **HCCLCheckpointEngine**: 适用于昇腾 NPU 环境

> 检查点引擎是 RLHF 训练基础设施的关键组件,确保训练器和采样器使用一致的模型权重。
> 目前的同步分为merge_and_sync=True/False两种情况,为True时将lora合并仅基模并同步,为False时仅同步lora权重。另外,多租户直接附加lora文件到vLLM上,在merge_and_sync=False,或使用多租户时,
Typo: 合并仅 should be 合并进 (merged into).
- > 目前的同步分为merge_and_sync=True/False两种情况,为True时将lora合并仅基模并同步,为False时仅同步lora权重。另外,多租户直接附加lora文件到vLLM上,在merge_and_sync=False,或使用多租户时,
+ > 目前的同步分为merge_and_sync=True/False两种情况,为True时将lora合并进基模并同步,为False时仅同步lora权重。另外,多租户直接附加lora文件到vLLM上,在merge_and_sync=False,或使用多租户时,
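For readers who do not read Chinese, the documented behavior is: with `merge_and_sync=True` the LoRA weights are merged into the base model and the merged weights are synchronized; with `merge_and_sync=False` only the LoRA weights are synchronized. The sketch below illustrates that distinction in plain Python. It is not this project's implementation: the function names, the payload keys, and the `scaling` parameter are all hypothetical, and the merge rule `W + scaling * (B @ A)` is the commonly used LoRA merge, assumed here rather than confirmed by the PR.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(base, lora_a, lora_b, scaling):
    """Return base + scaling * (lora_b @ lora_a) without mutating the inputs.

    Assumed shapes: base is d_out x d_in, lora_b is d_out x r, lora_a is r x d_in.
    """
    delta = matmul(lora_b, lora_a)
    return [
        [w + scaling * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]


def build_sync_payload(base, lora_a, lora_b, scaling, merge_and_sync):
    """Choose what would be sent to the sampler (hypothetical payload layout)."""
    if merge_and_sync:
        # Merge the adapter into the base weights and sync the merged result.
        return {"base": merge_lora(base, lora_a, lora_b, scaling)}
    # Otherwise sync only the adapter weights; the base model stays untouched.
    return {"lora_A": lora_a, "lora_B": lora_b}


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]
    lora_a = [[1.0, 2.0]]    # r x d_in with r = 1
    lora_b = [[1.0], [0.0]]  # d_out x r
    print(build_sync_payload(base, lora_a, lora_b, 0.5, merge_and_sync=True))
    print(build_sync_payload(base, lora_a, lora_b, 0.5, merge_and_sync=False))
```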
dependencies = [
"numpy>=2.0.0,<2.3.0",
- "datasets>=3.0,<4.0",
+ "datasets",
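To illustrate why the bounded constraint is more deterministic than a bare `datasets`, here is a simplified sketch of a `>=lower,<upper` range check. It is not how pip resolves versions: real resolvers follow PEP 440, while this sketch handles only plain dotted numeric versions. The helper names are hypothetical, and the version strings in the example are illustrative values, not a claim about which `datasets` releases exist.

```python
def parse_version(version, width=3):
    """Parse a dotted numeric version into a fixed-width tuple of ints.

    Shorter versions are padded with zeros, so "3.0" compares equal to "3.0.0".
    Pre-release and local version suffixes are not supported in this sketch.
    """
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def in_range(version, lower, upper):
    """Return True when lower <= version < upper."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


if __name__ == "__main__":
    # With "datasets>=3.0,<4.0", a future major release is rejected.
    for candidate in ("2.9.0", "3.0", "3.6.0", "4.0.0"):
        print(candidate, in_range(candidate, "3.0", "4.0"))
```

With an unbounded `datasets`, every one of these candidates would be acceptable, so two installs made at different times could resolve to different major versions.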
PR type
PR information
Write the detailed information for this PR.
Experiment results
Paste your experiment results here (if needed).